Distributed file system logging

ABSTRACT

A method, system, and computer program for consolidating data logged in log files in a network of servers, each server running at least one application that logs data into files on the server, the method comprising: providing a consolidating message queue for receiving the log data and file name; intercepting log data being written into a log file by a file system and sending that log data and the file name of the log file to a consolidating message queue; receiving the log data and file name in a consolidating message queue; and saving the log data in the consolidating message queue from all the servers to a consolidated file or data structure associated with the file name.

FOREIGN APPLICATION PRIORITY DATA

This application claims benefit of priority of Foreign Patent Application No. GB 09167985.2, filed in the United Kingdom on Aug. 17, 2009, which is herein incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to a distributed file system logging system and method. In particular this invention relates to a distributed file system logging system and method that intercepts file system writes to log files.

2. Description of the Related Art

In a mass virtual hosting Web cluster, a Web application can run as many instances on many Web Servers or virtual Web Servers in the Web cluster. Each instance of a Web application will write status data to at least one log file, resulting in many log files in many places and a complicated developer review. The number of log files can easily become too large for quick review and some consolidation is required. Several solutions have been proposed to consolidate the log files.

U.S. Pat. No. 7,356,590, filed by the Visible Measures Corporation, discloses a distributed capture and aggregation of dynamic application usage information that uses an application tracking method to access and view aggregated log files. The application tracking method sits in an application layer and listens for events and forwards the events to an aggregated log file.

U.S. Patent Application No. 2002/0087949, filed by Golender, V, et al, discloses a system and method for software diagnostics using a combination of visual and dynamic tracing and comprises information gathering modules. Each information gathering module gathers user input and output and consolidates such input and output for diagnostics.

U.S. Pat. No. 6,298,386, filed by the EMC Corporation, discloses a network file server having a message collector queue for connection and connectionless oriented protocols. Connection messages are intercepted and collected in the collector queue.

All of the above solutions rely on changes being made to the application running on the server, so whenever a new application is loaded, further changes must be made.

Another known method of consolidating the log files is to perform batch merging of the log files when the web application is idle. However, due to the interactive nature of developing/debugging web applications and following log files, developers need near real time access to the log files. If the Web application is not idle for long periods then batch merging is always out of date.

In normally secure applications, developers are not allowed interactive access to the web servers. Furthermore, if the Web cluster had a load balancing mechanism then it would be difficult to find which Web server a particular debugging session had used.

Network solutions where the log file is saved remotely in a consolidated network place tend to be unreliable and are not preferred since log files are required for audit purposes.

Applications could solve this problem by logging to a database, but the large number of tables and database connections that would be needed in a mass virtual hosting application such as this prevent this approach.

It is also possible for Web servers to pipe access logs into processes but this is not the case for error logs.

U.S. Patent Application No. 2006/025373, filed by the EMC Corporation, discloses backing up selected files of a computer system by using a mirroring driver attached to a file system driver to create back-up copies of data files concurrently with changes to data files. However, this disclosure does not address the problem of consolidating data logs.

SUMMARY OF THE INVENTION

According to one aspect of the invention there is provided a distributed file system logging method.

According to another aspect of the invention there is provided a distributed file system logging system.

According to another aspect of the invention there is provided a distributed file system logging computer program product.

By combining guaranteed once-and-only-once messaging, a chosen file system, and a simple daemon, logs are written to a flat file (as far as the web server can see) with no special logging configured. Under the covers, the file system is actually intercepting the low level file operations and placing complete log file lines onto a queue. The queue is a remote definition of a queue on another server. This queue is then read by a remote logging server, which places the log lines from all the other servers into the appropriate files.

Preferably, only entire log lines are sent to the message queue. By sending only entire lines, logs are not susceptible to interleaving as may happen with naive network implementations.

More preferably a local message queue is used in each Web server and a consolidating message queue consolidates all the messages.

Any service from any number of machines can now receive aggregate logs with no code changes or other modifications. Should the remote logging server be down or not running, log lines are not lost.

There is a single additional process on each machine regardless of the number of applications/virtual hosts that use this logging solution.

For the most part log lines will arrive in chronological order at the remote end, but for absolute chronological order a timestamp on each message in the queue would allow them to be serialized in the appropriate order.
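As a minimal sketch of the timestamp ordering described above, assuming each queued message carries a numeric timestamp alongside the server name, file name and log line: the `LogMessage` type, its field names and the sample values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class LogMessage:
    # Only the timestamp takes part in ordering; the other fields are payload.
    timestamp: float
    server: str = field(compare=False)
    file_name: str = field(compare=False)
    line: str = field(compare=False)


def in_chronological_order(messages):
    """Return the messages sorted by timestamp; ties keep their arrival order."""
    return sorted(messages)  # sorted() is stable


# Messages in the order they happened to arrive at the remote end.
arrived = [
    LogMessage(10.002, "web-b", "error.log", "b: second"),
    LogMessage(10.001, "web-a", "error.log", "a: first"),
    LogMessage(10.003, "web-a", "error.log", "a: third"),
]
ordered = in_chronological_order(arrived)
```

Sorting on a timestamp taken at each server only yields a true chronological order if the server clocks are synchronized, which this sketch assumes.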

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described, by means of example only, with reference to the accompanying drawings in which:

FIG. 1 is a schematic of a distributed file system logging system according to the present embodiment; and

FIG. 2 is a schematic of a distributed file system logging method of the preferred embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

System Architecture

FIG. 1 shows the preferred embodiment in a cluster of Web servers. The preferred embodiment is a consolidating system 100 that consolidates log files on more than one server 102A and 102B in a network. Each server 102A and 102B runs at least one application 104AA, 104AB, 104BA, or 104BB that logs data into a log file 106AA, 106AB, 106BA, or 106BB on that server. The servers 102A and 102B also comprise a file system interceptor 108A and 108B, and a local message queue 110A and 110B. Consolidating system 100 may also contain a consolidating message queue 112, consolidating logging daemon 114, and consolidating log file 116.

Each file system interceptor 108A, 108B is an interceptor of file system write operations located on each server and configured so that when log data is written into a log file, that log data, the file name of the log file and the server name are sent to the local message queue and subsequently to the consolidating message queue. In the preferred embodiment a timestamp is additionally sent to the message queue, whereby the merged view can sort the log data in absolute chronological order, although for general purposes the log data would be received in roughly the correct chronological order.

In this embodiment the file system interceptor is part of a virtual file system implemented in Linux using a loadable kernel module; the loadable kernel module allows non-privileged users to create their own file systems without editing the kernel code. One particularly useful kernel module is called Filesystem in Userspace (FUSE); however, an interceptor could be implemented in any kernel module or any file system for any operating system. When starting the virtual file system, it is necessary to make sure the consolidating message queue and the address of the message queue are identified so that the interceptor can be configured to send messages to the consolidating message queue at the message queue address. This configuration step need only be taken once so that the virtual file system component is mounted in the standard manner on each Web server.

The Web servers are configured to write their log files into a directory within this mount point. After configuration, the interceptor intercepts the minimal number of system calls needed to fool Web applications writing to the file system into believing that the file system is responding as expected. File system read operations are not required in a logging scenario, so the interceptor needs to act as write only. By default, the file system is empty; however, over the course of time it will acknowledge the existence of files that have been written to during its lifetime, to aid system administration as much as anything. It makes no attempt to resemble the state of the files on the remote end, as this functionality is not needed for a logging solution. Each write call is scanned for complete lines. Once a complete logging line is found (or the file is closed), the file system puts a message onto the local message queue, consisting of the file name, server name and the log file entry. The message is then sent over the network to the address of the consolidating message queue and the logging daemon.
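The scanning of write calls for complete lines can be illustrated with a minimal sketch. It is not a FUSE implementation: the `LineInterceptor` class, its `write` and `close` methods, the `enqueue` callable standing in for the local message queue, and the dictionary message layout are all hypothetical names chosen for illustration.

```python
import socket


class LineInterceptor:
    """Buffers writes per file and emits only complete lines."""

    def __init__(self, enqueue, server_name=None):
        # Hypothetical callable that puts one message on the local queue.
        self._enqueue = enqueue
        self._server = server_name or socket.gethostname()
        # File name -> bytes written so far that do not yet end in a newline.
        self._partial = {}

    def write(self, file_name, data):
        buffered = self._partial.get(file_name, b"") + data
        # Everything before the last newline is complete; the tail is held back.
        *complete, rest = buffered.split(b"\n")
        for line in complete:
            self._emit(file_name, line)
        self._partial[file_name] = rest
        return len(data)  # report the whole write as accepted

    def close(self, file_name):
        # On close, any unterminated tail is sent as a final entry.
        rest = self._partial.pop(file_name, b"")
        if rest:
            self._emit(file_name, rest)

    def _emit(self, file_name, line):
        self._enqueue({
            "file": file_name,
            "server": self._server,
            "line": line.decode("utf-8", "replace"),
        })


sent = []
fs = LineInterceptor(sent.append, server_name="web-a")
fs.write("error.log", b"first li")
fs.write("error.log", b"ne\nsecond line\nthi")
fs.close("error.log")
```

Because a partial line is held back until its newline arrives, a line split across two write calls reaches the queue as one message, which is what prevents interleaving between servers.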

Each local message queue system 110A, 110B is used in a preferred embodiment for greater reliability, but a single message queue at the consolidating end would work effectively in normal operating conditions. The local message queue ensures that a message will be sent and that the data logging is completed in a very secure and reliable manner.
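The role of the local message queue can be sketched as a simple store-and-forward buffer. This is an in-memory illustration only, under the assumption that the remote send fails with `ConnectionError` when the consolidating end is unreachable; a real message queue product would also persist pending messages and guard against duplicate delivery, which this sketch does not attempt. The `LocalQueue` class and `send_remote` callable are hypothetical.

```python
from collections import deque


class LocalQueue:
    """Holds messages locally until the remote end accepts them."""

    def __init__(self, send_remote):
        # Hypothetical callable; raises ConnectionError when the remote is down.
        self._send_remote = send_remote
        self._pending = deque()

    def put(self, message):
        self._pending.append(message)
        self.flush()

    def flush(self):
        """Forward pending messages in order; stop at the first failure."""
        while self._pending:
            try:
                self._send_remote(self._pending[0])
            except ConnectionError:
                return False  # keep the message for a later retry
            self._pending.popleft()  # remove only after a successful send
        return True

    def __len__(self):
        return len(self._pending)


remote = {"up": False}
received = []


def send_remote(message):
    if not remote["up"]:
        raise ConnectionError("consolidating queue unreachable")
    received.append(message)


local = LocalQueue(send_remote)
local.put("line 1")
local.put("line 2")   # remote still down: both lines are held locally
remote["up"] = True
local.flush()         # remote back: lines are delivered in their original order
```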

The consolidating message queue 112 is for receiving the log data, file name, server name and time stamp from the Web servers in the network. An example of one message queue is IBM WebSphere MQ; another example of a message queue is the Java Message Service. IBM, MQ, and WebSphere are registered trademarks of International Business Machines in the United States and/or other countries. Java is a trademark of Sun Microsystems in the United States and/or other countries.

The logging daemon 114 saves the log data, file name and server name in consolidating log file 116 and provides a merged view of the log data, file name, and server name from the saved log data, file name, and server name. In the preferred embodiment the merged view is provided in real time while the server is running the at least one application. The logging daemon can run on the same platform as the message queue. The daemon listens for messages, receives the logging data, and writes (creating files if necessary) the log entry into a file of the file name given in the message. The logging daemon then flushes the file to disk. No special consideration is given to security in this instance, though through appropriate use of firewalls, network encryption, authentication, etc., sensitive information can be protected and considered genuine.
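The daemon loop described above can be sketched as follows, using Python's standard `queue.Queue` as a stand-in for the consolidating message queue. The `run_daemon` function, the `None` shutdown sentinel, the dictionary message layout and the "server name, space, log line" record format are assumptions made for illustration, not details from the specification.

```python
import os
import queue
import tempfile


def run_daemon(messages, log_dir, stop=None):
    """Append each queued log line to log_dir/<file name>, flushing to disk."""
    while True:
        message = messages.get()
        if message is stop:  # sentinel: shut down (an assumption of this sketch)
            break
        # basename() keeps a message from writing outside log_dir.
        path = os.path.join(log_dir, os.path.basename(message["file"]))
        # Mode "a" creates the file if necessary and appends otherwise.
        with open(path, "a", encoding="utf-8") as out:
            out.write(f'{message["server"]} {message["line"]}\n')
            out.flush()
            os.fsync(out.fileno())  # force the data to disk


consolidating_queue = queue.Queue()
consolidating_queue.put({"file": "error.log", "server": "web-a", "line": "disk full"})
consolidating_queue.put({"file": "error.log", "server": "web-b", "line": "timeout"})
consolidating_queue.put(None)

log_dir = tempfile.mkdtemp()
run_daemon(consolidating_queue, log_dir)
with open(os.path.join(log_dir, "error.log"), encoding="utf-8") as f:
    consolidated = f.read().splitlines()
```

Entries from both servers end up in the one file named in the messages, in the order in which they were taken off the queue.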

System Method

The method of the preferred embodiment is described with reference to FIG. 2. In the network of servers, each server runs at least one application that logs data into files on the server, the method of consolidating the data logged into the log files on the server comprising: intercepting file system commands writing log data to a log file (step 202); receiving the log data and file name at a local message queue (step 204); receiving the log data and file name at the consolidating message queue (step 206); saving, by the logging daemon, the log data to a consolidated file associated with the file name (step 208); and providing a merged view of the logging data grouped by server (step 210).

In step 202, log data written into a log file (the log data, the file name of the log file and the server name) is intercepted and sent to a local message queue on the Web server. In another embodiment a timestamp is additionally sent to the local message queue.

In step 204, the log data, file name and server name are received in the local message queue from the interceptor, and subsequently sent to the consolidating message queue.

In step 206, the log data, file name and server name are received in the consolidating message queue from multiple Web applications running on multiple Web servers.

In step 208, the log data, file name and server name are saved by the logging daemon. Each separate item of log data is saved in the order in which it is received, so that a sequential list of all consolidated log data is built up in the consolidated log file.

In step 210, a merged view of the log data, file name, and server name is provided from the saved log data, file name, and server name. In the preferred embodiment the merged view is provided in real time while the server is running the at least one application. In the timestamp embodiment, the merged view can sort the log data in absolute chronological order.
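One possible shape for the merged view of step 210, grouped by server and optionally sorted by timestamp within each group, is sketched below. The `merged_view` function, the dictionary record layout and the sample records are hypothetical; the specification does not prescribe how the view is built.

```python
from collections import defaultdict


def merged_view(records, by_time=False):
    """Group consolidated records by server; optionally sort each group by timestamp."""
    groups = defaultdict(list)
    for record in records:
        groups[record["server"]].append(record)
    view = {}
    for server in sorted(groups):
        rows = groups[server]
        if by_time:
            rows = sorted(rows, key=lambda r: r["timestamp"])
        view[server] = [r["line"] for r in rows]
    return view


# Saved records in the order they were written to the consolidated file.
records = [
    {"server": "web-b", "timestamp": 3.0, "line": "b2"},
    {"server": "web-a", "timestamp": 2.0, "line": "a1"},
    {"server": "web-b", "timestamp": 1.0, "line": "b1"},
]
view = merged_view(records, by_time=True)
```

With `by_time=False` each group keeps the order in which the daemon saved the records, which corresponds to the embodiment without timestamps.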

Other Embodiments

It will be clear to one of ordinary skill in the art that all or part of the method of the preferred embodiments of the present invention may suitably and usefully be embodied in a logic apparatus, or a plurality of logic apparatus, comprising logic elements arranged to perform the steps of the method, and that such logic elements may comprise hardware components, firmware components, or a combination thereof.

It will be equally clear to one of skill in the art that all or part of a logic arrangement according to the preferred embodiments of the present invention may suitably be embodied in a logic apparatus comprising logic elements to perform the steps of the method, and that such logic elements may comprise components such as logic gates in, for example, a programmable logic array or application-specific integrated circuit. Such a logic arrangement may further be embodied in enabling elements for temporarily or permanently establishing logic structures in such an array or circuit using, for example, a virtual hardware descriptor language, which may be stored and transmitted using fixed or transmittable carrier media.

It will be appreciated that the method and arrangement described above may also suitably be carried out fully or partially in software running on one or more processors (not shown in the figures), and that the software may be provided in the form of one or more computer program elements carried on any suitable data-carrier (also not shown in the figures) such as a magnetic or optical disk or the like. Channels for the transmission of data may likewise comprise storage media of all descriptions as well as signal-carrying media, such as wired or wireless signal-carrying media.

The present invention may further suitably be embodied as a computer program product for use with a computer system. Such an implementation may comprise a series of computer-readable instructions either fixed on a tangible medium, such as a computer readable medium, for example, diskette, CD-ROM, ROM, or hard disk, or transmittable to a computer system, using a modem or other interface device, over either a tangible medium, including but not limited to optical or analogue communications lines, or intangibly using wireless techniques, including but not limited to microwave, infrared or other transmission techniques. The series of computer readable instructions embodies all or part of the functionality previously described herein.

Those skilled in the art will appreciate that such computer readable instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Further, such instructions may be stored using any memory technology, present or future, including but not limited to, semiconductor, magnetic, or optical, or transmitted using any communications technology, present or future, including but not limited to optical, infrared, or microwave. It is contemplated that such a computer program product may be distributed as a removable medium with accompanying printed or electronic documentation, for example, shrink-wrapped software, pre-loaded with a computer system, for example, on a system ROM or fixed disk, or distributed from a server or electronic bulletin board over a network, for example, the Internet or World Wide Web.

In an alternative, the preferred embodiment of the present invention may be realized in the form of a computer implemented method of deploying a service comprising steps of deploying computer program code operable to, when deployed into a computer infrastructure and executed thereon, cause the computer system to perform all the steps of the method.

In a further alternative, the preferred embodiment of the present invention may be realized in the form of a data carrier having functional data thereon, the functional data comprising functional computer data structures to, when loaded into a computer system and operated upon thereby, enable the computer system to perform all the steps of the method. It will be clear to one skilled in the art that many improvements and modifications can be made to the foregoing exemplary embodiment without departing from the scope of the present invention.

1. A method of consolidating log files comprising: intercepting log data being written into a log file by a file system and sending the log data and a file name of the log file to a consolidating message queue; receiving the log data and the file name of the log file in the consolidating message queue; and saving the log data to a consolidated file associated with the file name of the log file.
2. The method according to claim 1, further comprising sending a server name with the log data; saving the server name with the log data; and providing a merged view of the log data with the server name.
3. The method according to claim 1, wherein a timestamp is sent to the message queue and whereby the saved log data can be sorted in absolute chronological order.
4. The method according to claim 1, wherein consolidated log data is provided in real time while a server is running at least one application.
5. A system of consolidating data logged in log files on more than one server in a network, each server running at least one application and logs data into log files on each server, the system comprising: an interceptor in each server for intercepting log data written into a log file and placing the log data and a file name of the log file on a message queue; a message queue for receiving and consolidating the log data and the file name of the log file from servers in the network; and a file saver for saving the log data in a consolidated file associated with the file name of the log file.
6. The system according to claim 5, further comprising the interceptor intercepting and sending a server name with the log data to the message queue; the file saver saving the server name with the log data; and a file viewer for providing a merged view of the log data with server name.
7. The system according to claim 5, wherein a timestamp is sent to the message queue and whereby the merged view can sort the log data in chronological order.
8. The system according to claim 5, wherein a merged view is provided in real time while server is running the at least one application.
9. A computer program product comprising a data-carrier having computer readable code stored thereon for consolidating data logged in log files in a network of servers, each server running at least one application that logs data into files on the server, the computer readable code which when loaded onto a computer system and executed performs the following steps: providing a consolidating message queue for receiving the log data and file name; intercepting log data being written into a log file by a file system and sending that log data and the file name of the log file to a consolidating message queue; receiving the log data and file name in a consolidating message queue; and saving the log data in the consolidating message queue from all the servers to a consolidated file or data structure associated with the file name.
10. The computer program product according to claim 9, further comprising sending a server name with the log data; saving the server name with the log data; and providing a merged view of the log data with server name.
11. The computer program product according to claim 9, wherein a timestamp is sent to the message queue and whereby the saved log data can be sorted in absolute chronological order.
12. The computer program product according to claim 9, wherein a merged view is provided in real time while server is running at least one application.
13. The method according to claim 2, wherein a timestamp is sent to the message queue and whereby, in response to saving the log data to a consolidated file, sorting the saved log data in absolute chronological order.
14. The method according to claim 2, wherein the consolidated log data is provided in real time while the server is running at least one application.
15. The system according to claim 6, wherein a timestamp is sent to the message queue and whereby the merged view can sort the log data in chronological order.
16. The system according to claim 6, wherein the merged view is provided in real time while the server is running at least one application.
17. The system according to claim 15, wherein the merged view is provided in real time while the server is running at least one application.
18. The computer program product according to claim 10, wherein a timestamp is sent to the message queue and whereby in response to saving the log data to a consolidated file, sorting the log data in absolute chronological order.
19. The computer program product according to claim 18, wherein the merged view is provided in real time while the server is running at least one application.
20. The computer program product according to claim 10, wherein the merged view is provided in real time while the server is running at least one application.